Abstract

We present a prescriptive type system with parametric polymorphism and subtyping for 
constraint logic programs. The aim of this type system is to detect programming errors 
statically. It introduces a type discipline for constraint logic programs and modules, while 
maintaining the capabilities of performing the usual coercions between constraint domains, 
and of typing meta-programming predicates, thanks to the flexibility of subtyping. The 
property of subject reduction expresses the consistency of a prescriptive type system w.r.t. 
the execution model: if a program is "well-typed", then all derivations starting from a
"well-typed" goal are again "well-typed". That property is proved w.r.t. the abstract
execution model of constraint programming which proceeds by accumulation of constraints 
only, and w.r.t. an enriched execution model with type constraints for substitutions. We 
describe our implementation of the system for type checking and type inference. We report 
our experimental results on type checking ISO-Prolog, the (constraint) libraries of Sicstus 
Prolog and other Prolog programs. 



1 Introduction 

The class CLP($\mathcal{X}$) of Constraint Logic Programming languages was introduced by
Jaffar and Lassez (Jaffar & Lassez, 1987) as a generalization of the innovative features
introduced by Colmerauer in Prolog II (Colmerauer, 1984; Colmerauer, 1985):
namely computing in Prolog with structures other than the Herbrand terms, with
inequality constraints and with co-routining.

Inherited from the Prolog tradition, CLP($\mathcal{X}$) programs are untyped. Usually, however, the
structure of interest $\mathcal{X}$ is a quite complex combination of basic structures
that may include integer arithmetic, real arithmetic, booleans, lists, Herbrand
terms, infinite terms, etc., with implicit coercions between constraint domains
as in Prolog IV (Colmerauer, 1996). Even the early CLP($\mathcal{R}$) system of
(Jaffar & Lassez, 1987) already combines Herbrand terms with arithmetic expressions
in a non-symmetrical way: any arithmetic expression may appear under a Herbrand
function symbol, e.g. in a list, but not the other way around. The framework
of many-sorted logic in (Jaffar & Lassez, 1987) is not adequate for representing the
type system underlying such a combination, as it forces Herbrand function symbols
to have a unique type (e.g. over reals or Herbrand terms), whereas Herbrand functions
can be used polymorphically, e.g. in f(1) and f(f(1)), or the list constructor
in a list of lists of numbers [[3]].

The type system of Mycroft-O'Keefe (Mycroft & O'Keefe, 1984; Lakshman & Reddy, 1991;
Hill & Topor, 1992) is an adaptation to logic programming of the first type system
with parametric polymorphism, which was introduced by Damas-Milner for the functional
programming language ML. In this system, types are first-order terms, and type
variables inside types, like $\alpha$ in $list(\alpha)$, express type parameters. Programs defined
over a data structure of type $list(\alpha)$ can be used polymorphically over any homogeneous
list of elements of some type $\alpha$. Such a type system for Prolog is implemented
in the systems Gödel (Hill & Lloyd, 1994) and Mercury (Somogyi et al., 1996), for
example. The flexibility of parametric polymorphism is however by far insufficient
to handle properly coercions between constraint domains, such as booleans
as natural numbers, or lists as Herbrand terms, and it does not support the
meta-programming facilities of logic programming, with meta-predicates such as functor(X, F, N),
call(G) or setof(X, G, L).

Semantically, a ground type represents a set of expressions. Subtyping makes
type systems more expressive and flexible in that it allows one to express inclusions
among these sets. In this paper we investigate the use of subtyping for expressing
coercions between constraint domains, and for typing meta-programming predicates.
The idea is that by allowing subtype relations like $list(\alpha) \le term$, an atom
like functor([X|L], F, N) is well-typed with type declaration
$functor : term \times atom \times int \to pred$, although its first argument is a list. Similarly, we can type
$call : pred \to pred$, $freeze : term \times pred \to pred$, $setof : \alpha \times pred \times list(\alpha) \to pred$.
The absence of the subtype relation $list(\alpha) \le pred$ has the effect of raising a type error
if the call predicate is applied to a list. On the other hand, the subtype relation
$pred \le term$ makes coercions possible from goals to terms.

Most type systems with subtyping for logic programming languages that have
been proposed are descriptive type systems, i.e. their purpose is to describe the
success set of the program: they require that the type of a predicate be an upper
approximation of its denotation. On the other hand, in prescriptive type systems, types are
syntactic objects defined by the user to express the intended use of function and
predicate symbols in programs. Note that the distinction between descriptive and
prescriptive type systems is orthogonal to the distinction between type checking
and type inference, which are possible in both approaches.

There are only a few works considering prescriptive type systems for logic programs
with subtyping (Beierle, 1995; Dietrich & Hagl, 1988; Hanus, 1992; Hill & Topor, 1992;
Yardeni et al., 1992; Smolka, 1988). In these systems however, subtype relations between
parametric type constructors of different arities, like $list(\alpha) \le term$, are not
allowed; thus they cannot be used to type metaprogramming predicates and have
not been designed for that purpose. The system Typical (Meyer, 1996) possesses
an ad hoc mechanism for typing metapredicates which makes it quite difficult to
use. Our objective is to propose a simple type system that allows for a uniform
treatment of prescriptive typing issues in constraint logic programs.

In a prescriptive type system, the property of subject reduction expresses the
consistency of the type system w.r.t. the execution model: if a program is "well-typed",
then all derivations starting from a "well-typed" goal are again "well-typed".
This is a well-known result for the polymorphic type system without subtyping
(Mycroft & O'Keefe, 1984; Lakshman & Reddy, 1991; Hill & Topor, 1992), but when
subtypes are added to the picture, the absence of a fixed data flow in logic programs
makes it problematic to obtain a similar result. Beierle (Beierle, 1995) shows
the existence of principal typings with subtype relations between basic types, and
provides type inference algorithms; however, Beierle and also Hanus (Hanus, 1992)
do not claim subject reduction for the systems they propose. In general, types are
kept at run-time (Hanus, 1992; Yardeni et al., 1992) or modes are introduced to restrict
the data flow (Dietrich & Hagl, 1988; Smaus et al., 2000; Somogyi et al., 1996).

In this paper, by abstracting from particular structures as required in the CLP
scheme, we study a prescriptive type system for CLP programs that is independent
of any specific constraint domain $\mathcal{X}$. Section 2 presents the type system, which
includes parametric polymorphism and subtype relations between type constructors
of different arities, in a quite general type structure of poset with suprema. We show
two subject reduction results. One is relative to the abstract execution model of
constraint programming, which proceeds only by accumulation of constraints. The
proof of subject reduction holds independently of the computation domain, under
the assumption that the type of predicates satisfies the definitional genericity
principle (Lakshman & Reddy, 1991). The second subject reduction result is relative to
the more concrete execution model of CLP with substitution steps. We show that
for this second form it is necessary to keep at run-time the typing constraints on
variables inside well-typed programs and queries.

Section 3 describes the type checking algorithm and shows that the system of
subtype inequalities generated by the type checker is left-linear and acyclic. Section 4
presents a linear time algorithm for solving left-linear and acyclic systems of subtype
inequalities, and describes the cubic time algorithm of Pottier (Pottier, 2000a) for
solving general systems of inequalities, under the additional assumption that the
types form a lattice. Section 5 presents type inference algorithms for inferring the
types of variables and predicates in program clauses.

Section 6 describes our implementation, which is available from (Coquery, 2000).
The solving of subtype inequalities is done by an interface to the Wallace
constraint-handling library (Pottier, 2000b). In Section 7 we report our experimental results
on the use of this implementation to type check ISO-Prolog, the libraries of Sicstus
Prolog, including constraint programming libraries, and other Prolog programs.

2 Typed Constraint Logic Programs 

In this section we describe our type system as a logic for deriving type judgments 
about CLP programs. 

2.1 Types 

The type system we consider is based on a structure of partially ordered terms,
called poterms, that we use for representing types with both parametric polymorphism
and subtype polymorphism. Poterms generalize first-order terms by the definition
of a subsumption order based on function symbols, which comes in addition
to the instantiation preorder based on variables. Poterms are similar to order-sorted
feature terms or $\psi$-terms (Aït-Kaci & Nasr, 1986; Smolka, 1988; Aït-Kaci et al., 1997),
but we find it more convenient here to adopt a term syntax (with matching by position)
instead of a record syntax (with matching by name) for denoting static
types.

The set of types $\mathcal{T}$ is the set of terms formed over a denumerable set $\mathcal{U}$ of type
variables (also called parameters), denoted by $\alpha, \beta, \dots$, and a finite set of constructors $\mathcal{K}$,
where with each $K \in \mathcal{K}$ an arity $m \ge 0$ is associated (by writing $K/m$). Basic types
are type constructors of arity 0. We assume that $\mathcal{K}$ contains a basic type $pred$. A
flat type is a type of the form $K(\alpha_1, \dots, \alpha_m)$, where $K \in \mathcal{K}$ and the $\alpha_i$ are distinct
parameters.

The set of type variables in a type $\tau$ is denoted by $\mathcal{V}(\tau)$. The set of ground types
$\mathcal{G}$ is the set of types containing no variable. We write $\tau[\sigma/\alpha]$ to denote the type
obtained by replacing all the occurrences of $\alpha$ by $\sigma$ in $\tau$. We write $\tau[\sigma]$ to denote
that the type $\tau$ strictly contains the type $\sigma$ as a subexpression. The size of a type
$\tau$, defined as the number of occurrences of constructors and parameters in $\tau$, is
denoted by $size(\tau)$.

We now specify what kind of subtyping we allow. Intuitively, when a type $\sigma$ is
a subtype of a type $\tau$, this means that each term in $\sigma$ is also a term in $\tau$. The
subtyping relation $\le$ is designed to have certain nice algebraic properties, stated in
propositions below. We assume an order $\le$ on type constructors such that: $K/m \le K'/m'$
implies $m \ge m'$, and for each $K \in \mathcal{K}$ the set $\{K' \mid K \le K'\}$ has a maximum.
Moreover, we assume that with each pair $K/m \le K'/m'$, an injective mapping
$\iota_{K,K'} : \{1, \dots, m'\} \to \{1, \dots, m\}$ is associated such that $\iota_{K,K''} = \iota_{K,K'} \circ \iota_{K',K''}$
whenever $K \le K' \le K''$.

These assumptions mean that as we move up in the hierarchy of type constructors,
their arity decreases, and that the hierarchy need not be a lattice but only a poset with
suprema.

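The order on type constructors is extended to a structural covariant subtyping order
on types, denoted also by $\le$, defined as the least relation satisfying the following
rules:

(Par) $\alpha \le \alpha$, where $\alpha$ is a parameter

(Constr) $\dfrac{\tau_{\iota(1)} \le \tau'_1 \quad \dots \quad \tau_{\iota(m')} \le \tau'_{m'}}{K(\tau_1, \dots, \tau_m) \le K'(\tau'_1, \dots, \tau'_{m'})}$, where $K \le K'$ and $\iota = \iota_{K,K'}$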

Contravariant type constructors could be defined with a subtyping rule similar
to rule Constr but with the ordering relation reversed for some arguments, like
e.g. $\tau_{\iota(i)} \ge \tau'_i$ in the premise of the rule for some argument $\tau'_i$. Such contravariant
type constructors are not considered in this paper.

Therefore, if $int \le float$ then we have $list(int) \le list(float)$, $list(float) \not\le list(int)$,
and also $list(float) \not\le list(\alpha)$, as the subtyping order does not include
the instantiation pre-order. Intuitively, a ground type represents a set of expressions,
and the subtyping order between ground types corresponds to set inclusion.
Parametric types do not directly support this interpretation as it would identify all
parameters.
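
The rules (Par) and (Constr) can be read as a recursive check. The following Python sketch is only an illustration under stated assumptions: the encoding of types as nested tuples, the names `ARITY`, `EDGES`, `iota` and `subtype`, and the small constructor hierarchy are hypothetical and are not taken from the implementation described in this paper. Positions are 0-based, whereas the text counts from 1.

```python
# Types are nested tuples (constructor, arg_1, ..., arg_m); a basic type such
# as int is ("int",). Parameters are plain strings such as "a".
# Hypothetical hierarchy, for illustration only.
ARITY = {"int": 0, "float": 0, "pred": 0, "term": 0, "list": 1}

# Direct edges K <= K' with the injection from positions of K' to positions
# of K (empty here because every supertype has arity 0).
EDGES = {
    ("int", "float"): (),
    ("float", "term"): (),
    ("pred", "term"): (),
    ("list", "term"): (),   # list(a) <= term "forgets" the parameter
}


def iota(k, k2):
    """Return the injection for k <= k2 as a tuple of positions of k, or None."""
    if k == k2:
        return tuple(range(ARITY[k]))
    for (lo, hi), m in EDGES.items():
        if lo == k:
            up = iota(hi, k2)
            if up is not None:
                # Composition: each position of hi selected by `up` is
                # mapped through m to a position of k.
                return tuple(m[j] for j in up)
    return None


def subtype(s, t):
    """Structural covariant subtyping following rules (Par) and (Constr)."""
    if isinstance(s, str) or isinstance(t, str):
        return s == t        # (Par): a parameter is only related to itself
    m = iota(s[0], t[0])
    if m is None:
        return False         # the constructors are not ordered
    # (Constr): argument i of t is compared with argument m[i] of s.
    return all(subtype(s[1 + m[i]], t[1 + i]) for i in range(len(m)))
```

With this encoding, `subtype(("list", ("int",)), ("list", ("float",)))` holds while `subtype(("list", ("float",)), ("list", "a"))` does not, mirroring the examples above.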

[Figure 1: diagram of the type hierarchy rooted at term; the only node labels that survive extraction are term, t_eof_action and t_type.]

Fig. 1. Part of the type structure for ISO-Prolog.



The type structure given in Figure 1 represents a part of the types used for type
checking ISO-Prolog. The omitted types are the subtypes of atom associated to all
types, and other types for special values or options. The type $list(\alpha)$ is the only
parametric type used for ISO-Prolog. Other parametric types are used for typing
Prolog libraries such as $arrays(\alpha)$, $assoc(\alpha, \beta)$, $heaps(\alpha, \beta)$, $ordsets(\alpha)$, $queues(\alpha)$,
etc.



A type substitution $\Theta$ is an idempotent mapping from parameters to types that
is the identity almost everywhere. Applications of type substitutions are defined in
the obvious way.

Proposition 2.1

If $\sigma \le \tau$ then $\sigma\Theta \le \tau\Theta$ for any type substitution $\Theta$.

Proof

By structural induction on $\tau$. □

Proposition 2.2

If $\sigma \le \tau$ then $size(\sigma) \ge size(\tau)$.

Proof

By structural induction on $\tau$. □

Our assumption that for each $K \in \mathcal{K}$ the set $\{K' \mid K \le K'\}$ has a maximum,
together with the arity decreasing assumption, entails the existence of a maximum
supertype for any type:

Proposition 2.3

For each type $\tau$, the set $\{\sigma \mid \tau \le \sigma\}$ has a maximum, which is denoted by $Max(\tau)$.

Proof

By structural induction on $\tau$. □

This means that every $\le$-connected component of types has a root. For example,
a structure like $a \le b$, $c \le b$, $c \le d$ violates the hypothesis if $b$ and $d$ have no
common supertype serving as a root for the connected component. On the other
hand, that assumption neither assumes nor is implied by the existence of a
least upper bound for types having an upper bound (sup-quasi-lattice hypothesis in
(Smolka, 1989)).

Proposition 2.4

For all types $\tau$ and $\sigma$, $Max(\tau[\sigma/\alpha]) = Max(\tau)[Max(\sigma)/\alpha]$.

Proof

By structural induction on $\tau$. □
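
Propositions 2.3 and 2.4 can be checked on small examples. The sketch below is a hedged illustration, not the paper's implementation: the table `TOP`, which gives for each constructor its maximal constructor and the associated injection, and the constructor `pair`, which has no supertype here, are assumptions introduced only for this example.

```python
# Types are nested tuples (constructor, arg_1, ..., arg_m); parameters are
# plain strings. TOP maps each constructor K to (max constructor above K,
# injection given as a tuple of 0-based positions of K). Hypothetical data.
TOP = {
    "int": ("term", ()),
    "list": ("term", ()),
    "term": ("term", ()),
    "pair": ("pair", (0, 1)),   # assumed maximal: it keeps both arguments
}


def max_type(t):
    """Max(t): replace each constructor by its maximum, recursively."""
    if isinstance(t, str):      # a parameter is its own maximum
        return t
    top, positions = TOP[t[0]]
    return (top,) + tuple(max_type(t[1 + i]) for i in positions)


def subst(t, alpha, s):
    """t[s/alpha]: replace every occurrence of the parameter alpha by s."""
    if isinstance(t, str):
        return s if t == alpha else t
    return (t[0],) + tuple(subst(arg, alpha, s) for arg in t[1:])


tau = ("pair", ("int",), "a")
sigma = ("list", ("int",))
# Both sides of Proposition 2.4 evaluate to pair(term, term) here.
lhs = max_type(subst(tau, "a", sigma))
rhs = subst(max_type(tau), "a", max_type(sigma))
```

The example only exercises one instance of the equation; it is a sanity check, not a proof.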

Note that the possibility of "forgetting" type parameters in subtype relations, as
in $list(\alpha) \le term$, may provide solutions to inequalities of the form $list(\alpha) \le \alpha$,
e.g. $\alpha = term$. However, we have:

Proposition 2.5

An inequality of the form $\alpha \le \tau[\alpha]$ has no solution. An inequality of the form
$\tau[\alpha] \le \alpha$ has no solution if $\alpha \in \mathcal{V}(Max(\tau))$.

Proof

For any type $\sigma$, we have $size(\sigma) < size(\tau[\sigma])$, hence by Prop. 2.2 $\sigma \not\le \tau[\sigma]$, that is
$\alpha \le \tau[\alpha]$ has no solution.

For the second proposition, we prove its contrapositive. Suppose $\tau[\alpha] \le \alpha$ has
a solution, say $\tau[\sigma/\alpha] \le \sigma$. By definition of a maximum and Prop. 2.3 we have
$Max(\sigma) = Max(\tau[\sigma/\alpha])$. Hence by Prop. 2.4, $Max(\sigma) = Max(\tau)[Max(\sigma)/\alpha]$. By the
rules of subtyping we have $\alpha \ne Max(\tau)$. Therefore $\alpha \notin \mathcal{V}(Max(\tau))$, since otherwise
$Max(\sigma) = Max(\tau)[Max(\sigma)/\alpha]$ would contain $Max(\sigma)$ as a strict subexpression,
which is impossible. □
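
As a worked illustration of Proposition 2.5, consider the three cases below. The first two use only the relation $list(\alpha) \le term$ mentioned in the text; the third assumes a hypothetical hierarchy in which $list$ has no supertype other than itself, which is not the ISO-Prolog structure.

```latex
\begin{align*}
\alpha \le list(\alpha) &: \quad size(\sigma) < size(list(\sigma)) \text{ for every } \sigma,
  \text{ so there is no solution (Prop. 2.2).} \\
list(\alpha) \le \alpha &: \quad Max(list(\alpha)) = term,\ \alpha \notin \mathcal{V}(term),
  \text{ and } \alpha = term \text{ is a solution since } list(term) \le term. \\
list(\alpha) \le \alpha &: \quad \text{if instead } Max(list(\alpha)) = list(\alpha),
  \text{ then } \alpha \in \mathcal{V}(Max(list(\alpha))) \text{ and there is no solution.}
\end{align*}
```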

2.2 Well-typed programs 

CLP programs are built over a denumerable set $\mathcal{V}$ of variables, a finite set $\mathcal{F}$ of
function symbols, given with their arity (constants are functions of arity 0), and
a finite set $\mathcal{P}$ of program predicate and constraint predicate symbols given with
their arity, containing the equality constraint $=$. A query $Q$ is a finite sequence of
constraints and atoms. A program clause is an expression written $A \leftarrow Q$ where $A$
is an atom formed with a program predicate and $Q$ a query.

A type scheme is an expression of the form $\forall\bar{\alpha}\; \tau_1, \dots, \tau_n \to \tau$, where $\bar{\alpha}$ is the set of
parameters in types $\tau_1, \dots, \tau_n, \tau$. We assume that each function symbol $f \in \mathcal{F}$ has
a declared type scheme of the form $\forall\bar{\alpha}\; \tau_1, \dots, \tau_n \to \tau$, where $n$ is the arity of $f$, and $\tau$
is a flat type. Similarly, we assume that each predicate symbol $p \in \mathcal{P}$ has a declared
type scheme of the form $\forall\bar{\alpha}\; \tau_1, \dots, \tau_n \to pred$ where $n$ is the arity of $p$. The declared
type of the equality constraint symbol is $\forall u\; u, u \to pred$. For notational convenience,
the quantifiers in type schemes and the resulting type $pred$ of predicates will be
omitted in type declarations; the declared type schemes will be indicated by writing
$f_{\tau_1 \dots \tau_n \to \tau}$ and $p_{\tau_1 \dots \tau_n}$, assuming a fresh renaming of the parameters in $\tau_1, \dots, \tau_n, \tau$
for each occurrence of $f$ or $p$.

Throughout this paper, we assume that $\mathcal{K}$, $\mathcal{F}$, and $\mathcal{P}$ are fixed by means of
declarations in a typed program, where the syntactical details are insignificant for
our results.

A variable typing is a mapping from a finite subset of $\mathcal{V}$ to $\mathcal{T}$, written as
$\{x_1 : \tau_1, \dots, x_n : \tau_n\}$. The type system defines well-typed terms, atoms and clauses
relative to a variable typing $U$. The typing rules are given in Table 1. The rules
basically consist of the rules of Mycroft and O'Keefe plus a subtyping rule. Note
that for the sake of simplicity constraints are not distinguished from other atoms
in this system.



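(Sub) $\dfrac{U \vdash t : \tau \quad \tau \le \tau'}{U \vdash t : \tau'}$

(Var) $\{x : \tau, \dots\} \vdash x : \tau$

(Func) $\dfrac{U \vdash t_1 : \tau_1\Theta \quad \dots \quad U \vdash t_n : \tau_n\Theta}{U \vdash f_{\tau_1 \dots \tau_n \to \tau}(t_1, \dots, t_n) : \tau\Theta}$, where $\Theta$ is a type substitution

(Atom) $\dfrac{U \vdash t_1 : \tau_1\Theta \quad \dots \quad U \vdash t_n : \tau_n\Theta}{U \vdash p_{\tau_1 \dots \tau_n}(t_1, \dots, t_n)\ Atom}$, where $\Theta$ is a type substitution

(Head) $\dfrac{U \vdash t_1 : \tau_1\Theta \quad \dots \quad U \vdash t_n : \tau_n\Theta}{U \vdash p_{\tau_1 \dots \tau_n}(t_1, \dots, t_n)\ Head}$, where $\Theta$ is a renaming substitution

(Query) $\dfrac{U \vdash A_1\ Atom \quad \dots \quad U \vdash A_n\ Atom}{U \vdash A_1, \dots, A_n\ Query}$

(Clause) $\dfrac{U \vdash Q\ Query \quad U \vdash A\ Head}{U \vdash A \leftarrow Q\ Clause}$

Table 1. The type system.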

An object, say a term $t$, is well-typed if there exist some variable typing $U$ and
some type $\tau$ such that $U \vdash t : \tau$. Otherwise the term is ill-typed (and likewise for
atoms, etc.). A program is well-typed if all its clauses are well-typed.

The distinction between rules Head and Atom expresses the usual definitional
genericity principle (Lakshman & Reddy, 1991), which states that the type of a
defining occurrence of a predicate (i.e. at the left of $\leftarrow$ in a clause) must be
equivalent up to renaming to the assigned type of the predicate. The rule Head used
for deriving the type of the head of the clause is thus not allowed to use substitutions
other than variable renamings in the declared type of the predicate. For example, the
predicate member can be typed polymorphically, i.e. $member : \alpha \times list(\alpha) \to pred$, if
its definition does not contain special facts like member(1, [1]), which would force its
type to be $member : int \times list(int) \to pred$ in order to satisfy the definitional genericity
condition.

The following proposition shows that if an expression other than a clause or a
head is well-typed in a variable typing $U$, it remains well-typed in any instance $U\Theta$.

Proposition 2.6

For any variable typing $U$, any type judgement $R$ other than a Head or a Clause,
and any type substitution $\Theta$, if $U \vdash R$ then $U\Theta \vdash R\Theta$.

Proof

By induction on the height of the derivation tree for $U \vdash R$. □

2.3 Subject reduction w.r.t. CSLD resolution 

Subject reduction is the property that evaluation rules transform a well-typed
expression into another well-typed expression. The evaluation rule for constraint logic
programming is CSLD-resolution. To recall this evaluation rule, it is convenient to
distinguish in a query $Q$ the constraint part $c$ (where the sequence denotes the
conjunction) from the other sequence of atoms $\mathbf{A}$. We use the notation $Q = c|\mathbf{A}$ to
make this distinction. Given a constraint domain $\mathcal{X}$ which fixes the interpretation
of constraints, a query $c'|\mathbf{B}$ is a CSLD-resolvent of a query $c|\mathbf{A}$ and a (renamed
apart) program clause $p(t_1, \dots, t_n) \leftarrow d|A$, if

$\mathbf{A} = A_1, \dots, A_{k-1}, p(t'_1, \dots, t'_n), A_{k+1}, \dots, A_m$,

$\mathbf{B} = A_1, \dots, A_{k-1}, A, A_{k+1}, \dots, A_m$,

and the constraint $c' = (c \wedge d \wedge t_1 = t'_1 \wedge \dots \wedge t_n = t'_n)$ is $\mathcal{X}$-satisfiable.

Theorem 2.1 (Subject Reduction for CSLD resolution)

Let $P$ be a well-typed CLP($\mathcal{X}$) program, and $Q$ be a well-typed query, i.e.
$U \vdash Q\ Query$ for some variable typing $U$. If $Q'$ is a CSLD-resolvent of $Q$, then there
exists a variable typing $U'$ such that $U' \vdash Q'\ Query$.

Proof

Let us assume without loss of generality that $Q = c|p(s), A$, and that $Q'$ is a
CSLD-resolvent of $Q$ with the program clause $p(t) \leftarrow d|B$.
Thus $Q' = c, d, s = t | A, B$.

As $Q$ is well-typed, we have $U \vdash c|p(s), A\ Query$. And as the program is well
typed, there exists a variable typing $U''$, renamed apart from $U$, such that
$U'' \vdash p(t) \leftarrow d|B\ Clause$.

Let $p : \tau \to pred$ be the type declaration of predicate $p$. Since $U \vdash p(s)\ Atom$, we
have $U \vdash s : \tau\Theta$ for some substitution $\Theta$.

Now let $U' = U \cup U''\Theta$. By proposition 2.6 we have $U''\Theta \vdash d, B\ Query$, thus
$U' \vdash c, d, A, B\ Query$. What remains to be shown is $U' \vdash s = t\ Atom$.

Since $U'' \vdash p(t)\ Head$, we have $U'' \vdash t : \tau$. Hence by proposition 2.6, $U''\Theta \vdash t : \tau\Theta$.
Therefore we have $U \vdash s : \tau\Theta$ and $U''\Theta \vdash t : \tau\Theta$, from which we conclude
$U' \vdash s = t\ Atom$. □

It is worth noting that the previous result would not hold without the definitional
genericity condition (expressed in rule Head). For example, with two constants $a : \tau_a$
and $b : \tau_b$, and one predicate $p : \alpha \to pred$ defined by the non definitional generic
clause p(a), we have that the query p(b) is well typed, but b = a is a resolvent that
is ill-typed if $\tau_a$ and $\tau_b$ have no upper bound.



2.4 Subject reduction w.r.t. substitutions

The CSLD reductions, written $\to_{CSLD}$, are in fact an abstraction of the operational
reductions that may also perform substitution steps, written $\to_\sigma$, instead of keeping
equality constraints. As in the CLP scheme constraints are handled modulo logical
equivalence (Jaffar & Lassez, 1987), it is clear that the diagram of both reductions
commutes:



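[Diagram: a row of CSLD steps $Q_1 \to_{CSLD} Q_2 \to_{CSLD} \dots \to_{CSLD} Q_n$, related by vertical substitution steps $\to_\sigma$ to a second row of CSLD steps; the exact layout is not recoverable from the extraction.]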

However, the previous subject reduction result expresses the consistency of types
w.r.t. horizontal reduction steps only, that is w.r.t. the abstract execution model
which accumulates constraints, but may not hold for more concrete operations
of constraint solving and substitutions. For example, with the subtype relations
$int \le term$, $pred \le term$, the type declarations $= : \alpha \times \alpha \to pred$, $p : int \to pred$,
and the program p(X), the query Y = true, p(Y) is well typed with $Y : int$, and
succeeds with Y = true, although the query obtained by substitution, p(true),
is ill-typed. In order to establish subject reduction for substitution steps, and be
consistent with the semantical equivalence of programs, one needs to consider a
typed execution model with type constraints on variables checked at runtime. In the
example, the type constraint $Y : int$ with the constraint Y = true is unsatisfiable;
the query can thus be rejected at compile-time by checking the satisfiability of its
typed constraints.
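
The rejection of Y = true, p(Y) can be pictured with a toy interpretation of types as finite sets. The sketch below rests on strong assumptions that are not in the paper: the sets in `DOMAIN` are made up, only basic types are handled, and equalities are restricted to the form variable = constant. The function name `satisfiable` is hypothetical.

```python
# Toy interpretation of basic types as finite sets of values (assumed data).
DOMAIN = {
    "int": {0, 1, 2},
    "pred": {"true", "fail"},
}
# term is a supertype of both, so it is interpreted as their union.
DOMAIN["term"] = DOMAIN["int"] | DOMAIN["pred"]


def satisfiable(type_constraints, equalities):
    """Check a conjunction of type constraints and variable = constant equalities.

    type_constraints maps a variable name to a basic type name;
    equalities maps a variable name to the constant it is equated with.
    """
    for var, ty in type_constraints.items():
        candidates = DOMAIN[ty]
        if var in equalities:
            # The variable must denote exactly the given constant.
            candidates = candidates & {equalities[var]}
        if not candidates:
            return False
    return True
```

Under these assumptions, `satisfiable({"Y": "int"}, {"Y": "true"})` is false, which corresponds to rejecting the query, whereas typing Y with term makes the same equality satisfiable.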

Definition 2.1

Given a constraint system over some domain $\mathcal{X}$, a typed constraint system over
$\mathcal{X} \cup 2^{\mathcal{X}}$ is defined by adding type constraints, i.e. expressions of the form $t : \tau$ where
$t$ is a term and $\tau$ a type. Basic types are interpreted by distinguished subsets of $\mathcal{X}$
and type constructors by mappings between subsets of $\mathcal{X}$ satisfying the subtyping
relation $\le$ and the type declarations for function and predicate symbols. A type
constraint $t : \tau$ is satisfiable if there exists a valuation $\rho$ of the variables in $t$ and
the free parameters in $\tau$ such that $t\rho \in \tau\rho$. A typed constraint system composed
of type constraints and constraints over $\mathcal{X}$ is satisfiable if there exists a valuation
which satisfies all constraints of the system.

Lemma 2.1

In a typed constraint system, $X : \tau \wedge X = t$ entails $t : \tau$.

Proof

For any valuation $\rho$, if $X\rho \in \tau\rho$ and $X\rho = t\rho$ then $t\rho \in \tau\rho$. □

Definition 2.2

The TCLP clause (resp. query) associated to a well-typed program (resp. query)
in a typed environment $U$ is the clause (resp. query) augmented with the type
constraints in $U$.

Theorem 2.2 (Subject Reduction for substitutions)

Let $P$ be a TCLP program associated to a well-typed CLP($\mathcal{X}$) program, and $Q$ be a
TCLP query such that $U \vdash Q\ Query$ for some variable typing $U$. If $Q'$ is a
CSLD-resolvent of $Q$, then the variable typing $U'$ associated to the type constraints in $Q'$
gives $U' \vdash Q'\ Query$. Furthermore, if $Q'$ contains an equality constraint $X = t$ then
$U' \vdash Q'[t/X]\ Query$.

Proof

Subject reduction for CSLD resolution follows from theorem 2.1, as TCLP programs
are just a special case of well-typed CLP programs. Furthermore, one easily checks
that the type constraints in $Q'$, which come from the type constraints in $Q$ and from
the resolving TCLP clause, give exactly the type environment $U'$ constructed in
the proof of the previous theorem, thus $U' \vdash Q'\ Query$.

Now let $X = t$ be a constraint in a resolvent $Q'$. Let $X : \tau \in U'$. We have $X : \tau$
in the constraint part of $Q'$, which together with $X = t$ entails $t : \tau$ by lemma 2.1.
Therefore it is immediate from the typing rules that by replacing $X$ by $t$ in the
derivation of $U' \vdash Q'\ Query$, and by completing the derivation with the derivation
of $t : \tau$ instead of $X : \tau$, we get a derivation of $U' \vdash Q'[t/X]\ Query$. □

The effect of type constraints in TCLP programs is to prevent the derivation of
ill-typed queries by substitution steps. In addition, queries such as
X : int, X = true, p(X) can be rejected at compile-time because of the unsatisfiability of their
constraints. Similarly, TCLP program clauses having unsatisfiable typed constraints
can be rejected at compile-time.

Note that in (Smaus et al., 2000) another result of subject reduction for substitutions
is shown without the addition of type constraints, but in a very restricted
context of moded logic programs.

3 Type checking

The system described by the rules of Table 1 is non-deterministic, since the rule
Sub can be used anywhere in a typing derivation. One can obtain a deterministic
type checker, directed by the syntax of the typed program, simply by replacing the
rule Sub by variants (Func'), (Atom') and (Head') of the rules Func, Atom and Head
with the subtype relation in their premises. This leads to the type system of Table 2.

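(Var) $\{x : \tau, \dots\} \vdash x : \tau$

(Func') $\dfrac{U \vdash t_1 : \sigma_1 \quad \sigma_1 \le \tau_1\Theta \quad \dots \quad U \vdash t_n : \sigma_n \quad \sigma_n \le \tau_n\Theta}{U \vdash f_{\tau_1 \dots \tau_n \to \tau}(t_1, \dots, t_n) : \tau\Theta}$, where $\Theta$ is a type substitution

(Atom') $\dfrac{U \vdash t_1 : \sigma_1 \quad \sigma_1 \le \tau_1\Theta \quad \dots \quad U \vdash t_n : \sigma_n \quad \sigma_n \le \tau_n\Theta}{U \vdash p_{\tau_1 \dots \tau_n}(t_1, \dots, t_n)\ Atom}$, where $\Theta$ is a type substitution

(Head') $\dfrac{U \vdash t_1 : \sigma_1 \quad \sigma_1 \le \tau_1\Theta \quad \dots \quad U \vdash t_n : \sigma_n \quad \sigma_n \le \tau_n\Theta}{U \vdash p_{\tau_1 \dots \tau_n}(t_1, \dots, t_n)\ Head}$, where $\Theta$ is a renaming substitution

(Query) $\dfrac{U \vdash A_1\ Atom \quad \dots \quad U \vdash A_n\ Atom}{U \vdash A_1, \dots, A_n\ Query}$

(Clause) $\dfrac{U \vdash Q\ Query \quad U \vdash A\ Head}{U \vdash A \leftarrow Q\ Clause}$

Table 2. The type system in second form.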

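Proposition 3.1

A program is well typed in the original system if and only if it is well typed in the
new one.

Proof

Clearly, if a program is typable in the new system, it is typable in the original
one: one just has to replace every occurrence of the (Func'), (Atom') and (Head') rules
respectively with the following derivations:

$\dfrac{\dfrac{U \vdash t_1 : \sigma_1 \quad \sigma_1 \le \tau_1\Theta}{U \vdash t_1 : \tau_1\Theta}\,(Sub) \quad \dots \quad \dfrac{U \vdash t_n : \sigma_n \quad \sigma_n \le \tau_n\Theta}{U \vdash t_n : \tau_n\Theta}\,(Sub)}{U \vdash f_{\tau_1 \dots \tau_n \to \tau}(t_1, \dots, t_n) : \tau\Theta}\,(Func)$

$\dfrac{\dfrac{U \vdash t_1 : \sigma_1 \quad \sigma_1 \le \tau_1\Theta}{U \vdash t_1 : \tau_1\Theta}\,(Sub) \quad \dots \quad \dfrac{U \vdash t_n : \sigma_n \quad \sigma_n \le \tau_n\Theta}{U \vdash t_n : \tau_n\Theta}\,(Sub)}{U \vdash p_{\tau_1 \dots \tau_n}(t_1, \dots, t_n)\ Atom}\,(Atom)$

$\dfrac{\dfrac{U \vdash t_1 : \sigma_1 \quad \sigma_1 \le \tau_1\Theta}{U \vdash t_1 : \tau_1\Theta}\,(Sub) \quad \dots \quad \dfrac{U \vdash t_n : \sigma_n \quad \sigma_n \le \tau_n\Theta}{U \vdash t_n : \tau_n\Theta}\,(Sub)}{U \vdash p_{\tau_1 \dots \tau_n}(t_1, \dots, t_n)\ Head}\,(Head)$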

Conversely, if a program is typable in the original system, it is typable in the
second one, written here $\vdash_2$. The proof is by induction on the typing derivation in
the original system. The rules (Var), (Query) and (Clause) remain the same. The
rules (Atom) and (Head) are similar to rule (Func). We thus show the property for
any term $t$: if $U \vdash t : \tau$ in the first system, then $U \vdash_2 t : \tau'$ in the second system
with $\tau' \le \tau$.

Let us consider the three possible cases: either the proof terminates by the
application of the (Var) rule, by the application of the (Func) rule, or by the application
of the (Sub) rule.

The first case is trivial as the rule (Var) is the same in both systems.

In the second case, according to the (Func) rule, $U \vdash t_1 : \tau_1\Theta \ \dots\ U \vdash t_n : \tau_n\Theta$.
Then, by the induction hypothesis, the terms $t_1, \dots, t_n$ are also type checked to
$U \vdash_2 t_1 : \tau'_1 \ \dots\ U \vdash_2 t_n : \tau'_n$ by the second system, with $\tau'_i \le \tau_i\Theta$, $i = 1..n$. By
applying the (Func') rule with $\tau'_i = \sigma_i$, $i = 1..n$, we get $U \vdash_2 f(t_1, \dots, t_n) : \tau\Theta$.

In the third case, according to the (Sub) rule, $U \vdash t : \tau$ and $\tau \le \tau'$ allow us to
deduce $U \vdash t : \tau'$. By the induction hypothesis, $t$ is type checked to $U \vdash_2 t : \sigma$ in the
second system, where $\sigma \le \tau$. Since $\tau \le \tau'$, we have $\sigma \le \tau'$. So $t$ is type checked to
$U \vdash_2 t : \sigma$, where $\sigma \le \tau'$. □

The construction of the substitution $\Theta$ needed in rules (Func'), (Atom') and
(Head') for type checking can be done by solving the system of subtype inequalities
collected along the derivation of a type judgement. The parameters in the type
environment (i.e. the parameters in the types of variables) are however not under
the scope of these substitutions, as they act only on the parameters of the (renamed
apart) type declarations for function and predicate symbols. We are thus looking
for type substitutions with a restricted domain. For the sake of simplicity however,
instead of dealing formally with the domain of type substitutions, we shall
simply assume that the parameters in the types of variables are replaced by new
constants for checking the satisfiability of subtype inequalities, which avoids unsound
instantiations.

Now let $\Sigma$ be the collection of subtype inequalities imposed on types by rules
(Func'), (Atom') and (Head') in a derivation. Let us define the size of a system
of inequalities as the number of symbols. The size of the system $\Sigma$ of inequalities
associated to a typed program is $O(nvd)$ where $v$ is the size of the type declarations
for variables in the program, $n$ is the size of the program, and $d$ is the size of the
type declarations for function and predicate symbols.
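
To make the collection of inequalities concrete, the following Python sketch walks a term and records the inequalities that rule (Func') would post. It is an illustration only: the tuple encoding of terms and types, the declarations in `DECLS`, and the names `rename` and `collect` are assumptions made for this example, and the types of CLP variables are taken to be ground, standing for parameters already replaced by constants.

```python
import itertools

# Hypothetical declarations: function symbol -> (argument types, flat result type).
# Types are nested tuples; declared parameters are plain strings such as "A".
DECLS = {
    "nil": ((), ("list", "A")),
    "cons": (("A", ("list", "A")), ("list", "A")),
}


def rename(t, suffix):
    """Rename apart the parameters of a declared type by appending a suffix."""
    if isinstance(t, str):
        return f"{t}{suffix}"
    return (t[0],) + tuple(rename(arg, suffix) for arg in t[1:])


def collect(term, env, out, fresh):
    """Return the type of `term`, appending the inequalities of rule (Func') to `out`.

    A term is either a variable name (a string typed by `env`) or a tuple
    (function symbol, subterm_1, ..., subterm_n). `fresh` yields renaming suffixes.
    """
    if isinstance(term, str):            # rule (Var)
        return env[term]
    name, args = term[0], term[1:]
    n = next(fresh)                      # one fresh renaming per occurrence
    arg_types, result = DECLS[name]
    for arg, declared in zip(args, arg_types):
        # Post "type of the subterm <= declared argument type".
        out.append((collect(arg, env, out, fresh), rename(declared, n)))
    return rename(result, n)


inequalities = []
list_type = collect(("cons", "X", ("nil",)), {"X": ("int",)},
                    inequalities, itertools.count(1))
```

For the term cons(X, nil) with X : int, the collected system is int ≤ A1 and list(A2) ≤ list(A1), and the resulting type is list(A1). Solving the system is a separate step that this sketch does not attempt.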

As the type system is deterministic we have:

Proposition 3.2

A well-formed program is typable if and only if the system of inequalities collected
along its derivation is satisfiable.

It is worth noting that the system of inequalities $\Sigma$ collected in this way for type
checking has in fact a very particular form.

A system $\Sigma$ of inequalities is left-linear if any type variable has at most one
occurrence at the left of $\le$ in the system. $\Sigma$ is acyclic if there exists a ranking function
on type variables $r : \mathcal{U} \to \mathbb{N}$ such that if $\sigma \le \tau \in \Sigma$, $\alpha \in \mathcal{V}(\sigma)$ and $\beta \in \mathcal{V}(\tau)$ then
$r(\alpha) < r(\beta)$.
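
Both conditions are easy to test on a given system. In the Python sketch below, a system is a list of pairs (left type, right type) in a nested-tuple encoding with parameters as strings; this encoding and the function names are assumptions made for illustration. Acyclicity is checked by looking for a cycle in the graph that has an edge from each variable on the left of an inequality to each variable on its right, which is equivalent to the existence of a ranking function.

```python
from collections import Counter


def type_vars(t):
    """Parameters occurring in a type, listed with multiplicity."""
    if isinstance(t, str):
        return [t]
    return [v for arg in t[1:] for v in type_vars(arg)]


def left_linear(system):
    """True if every parameter occurs at most once on the left-hand sides."""
    counts = Counter(v for lhs, _ in system for v in type_vars(lhs))
    return all(n <= 1 for n in counts.values())


def acyclic(system):
    """True if a ranking r with r(a) < r(b) for each left/right pair exists."""
    succ = {}
    for lhs, rhs in system:
        for a in type_vars(lhs):
            succ.setdefault(a, set()).update(type_vars(rhs))
    state = {}  # 1 = on the current depth-first path, 2 = fully explored

    def visit(v):
        if state.get(v) == 1:
            return False      # back edge: the required ranking cannot exist
        if state.get(v) == 2:
            return True
        state[v] = 1
        ok = all(visit(w) for w in succ.get(v, ()))
        state[v] = 2
        return ok

    return all(visit(v) for v in list(succ))
```

For instance, the system list(a) ≤ list(b), b ≤ term passes both tests, while a ≤ list(a) is left-linear but not acyclic.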

Proposition 3.3

The system of inequalities generated by the type checking algorithm is acyclic and
left-linear.

Proof

As the type variables in the types of CLP variables have been renamed into constants,
the only type variables occurring in $\Sigma$ are introduced by rules (Func'),
(Atom') and (Head'), and come from (renamed apart) type declarations of function
and predicate symbols. We can thus associate to each type variable $\alpha$ a rank $h(\alpha)$
defined as the height of its introduction node in the derivation tree (i.e. the maximal
distance from the node to its leaves). Now a rule (Func'), (Atom') or (Head')
at height $h$ posts inequalities of the form $\sigma \le \tau$, where the rank of the variables in
$\tau$ is $h$, and the rank of the variables in $\sigma$ is $h - 1$. The system is thus acyclic.

The type variables at the left of $\le$ are those parameters that come from the
result type of a function declaration, e.g. $\alpha$ in $nil : list(\alpha)$. As the result type is
a flat type, the variables in a result type are distinct and renamed apart, hence
the variables occurring in a type at the left of $\le$ have a unique occurrence in the
system. The system is thus trivially left-linear. □

Note that if we allowed contravariant type constructors, the previous proposition 
would not hold. 

A linear time algorithm for solving acyclic left-linear systems is given in the next 
section. 

4 Subtype inequalities 

The satisfiability of subtype inequalities (SSI) problem is the problem of determining
whether a system of subtype relations¹ $\bigwedge_{i=1}^{n} \tau_i \leq \tau'_i$ over
types $\tau_1, \tau'_1, \ldots, \tau_n, \tau'_n$ has a solution, i.e., whether there
exists a substitution $\theta$ such that
$\bigwedge_{i=1}^{n} \tau_i\theta \leq \tau'_i\theta$.

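Definition 4.1

A solution to an inequality $\tau \leq \tau'$ is a substitution $\theta$ such that
$\tau\theta \leq \tau'\theta$. A maximal solution is a solution $\theta$ such that for
any solution $\theta'$ there exists a substitution $\rho$ such that
$\alpha\theta' \leq \alpha\theta\rho$ for every type variable $\alpha$.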
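As a small worked illustration of this definition, assume (for this example only) two basic types with int $\leq$ float and the covariant constructor list. The inequality below reduces to an inequality on its parameter, and both substitutions shown are solutions:

```latex
\[
list(\alpha) \leq list(float) \iff \alpha \leq float
\]
\[
\theta_1 = \{\alpha \leftarrow int\}, \qquad \theta_2 = \{\alpha \leftarrow float\}
\]
\[
\alpha\theta_1 = int \leq float = \alpha\theta_2
\]
```

Here $\theta_2$ is maximal: every solution $\theta'$ satisfies $\alpha\theta' \leq float = \alpha\theta_2$, so $\rho$ can be taken to be the identity.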

The SSI problem has been deeply studied in the functional programming community.
Due to the lack of results for the general case, special instances of the SSI
problem have been identified along several axes:

• the form of the types: basic types, constructor types, covariant (our case in
this paper) or contravariant;

• the structure of the types: (disjoint union of) lattices (Tiuryn, 1992),
quasi-lattices (Smolka, 1989), n-crown (Tiuryn, 1992), posets with suprema (our
case), partial orders (Frey, 1997);

• the form of the type constraints.

¹ The SSI problem should not be confused with the semi-unification problem which is
defined with the instantiation pre-ordering, instead of the subtype ordering:
$\exists\theta \bigwedge_{i=1}^{n} \exists\theta_i\ \tau_i\theta\theta_i = \tau'_i\theta$.
The undecidability of semi-unification is shown in (Kfoury et al., 1989).

In this section we show that the type constraints generated by the type checking 
algorithms can be solved in linear time in our quite general structure of types, and 
that the type constraints generated by the type inference algorithms can be solved 
in cubic time, under the additional assumption that the types form a lattice. 



4.1 The acyclic left-linear case

We show that the satisfiability of acyclic left-linear subtype inequalities can be
decided in linear time, and admit maximal solutions in our general type structure
$(\mathcal{T}, \leq)$ of posets with suprema.

In this section, we present an algorithm which proceeds by simplification of the
subtype inequalities and introduces equations between a parameter and a type. We
say that a system E is in solved form if it contains only equations of the form

$\{\alpha_1 = \tau_1, \ldots, \alpha_n = \tau_n\}$

where the $\alpha_i$'s are all different and have no other occurrence in E. The
substitution
$\theta_E = \{\alpha_1 \leftarrow \tau_1, \ldots, \alpha_n \leftarrow \tau_n\}$
associated to a system in solved form E is trivially a maximal solution. We show that
the following simplification rules compute solved forms for satisfiable acyclic
left-linear systems:

(Decomp) $E,\ K(\tau_1, \ldots, \tau_m) \leq K'(\tau'_1, \ldots, \tau'_n) \longrightarrow E,\ \bigwedge_{i=1}^{n} \tau_{\iota(i)} \leq \tau'_i$
if $K \leq K'$ and $\iota = \iota_{K,K'}$.

(Triv) $E,\ \alpha \leq \alpha \longrightarrow E$

(VarLeft) $E,\ \alpha \leq \tau \longrightarrow \alpha = \tau,\ E[\tau/\alpha]$
if $\tau \neq \alpha$, $\alpha \notin V(\tau)$.

(VarRight) $E,\ \tau \leq \alpha \longrightarrow \alpha = Max(\tau),\ E[Max(\tau)/\alpha]$
if $\tau \notin V$, $\alpha \notin V(l)$ for any $l \leq r \in E$, and
$\alpha \notin V(Max(\tau))$.

Lemma 4.1

The rules terminate in $O(n)$ steps, where $n$ is the sum of the sizes of the terms in
the left-hand sides of the inequalities.

Proof

It suffices to remark that each rule strictly decreases the sum of the sizes of the
terms in the left-hand sides of the inequalities: (Triv) and (VarLeft) by one, (Decomp)
by at least one, and (VarRight) by the size of $\tau$. □

One can easily check that each rule preserves the left-linearity as well as the
acyclicity of the system. Moreover:

Lemma 4.2

Each rule preserves the satisfiability of the system, as well as its maximal solution
if one exists.

Proof

Rules (Decomp) and (Triv) preserve all solutions, by definition of the subtyping
order. Rule (VarLeft) replaces a parameter $\alpha$ by its upper bound $\tau$. As the
system is left-linear this computes the maximal solution for $\alpha$, and thus
preserves the maximal solution of the system if one exists. Rule (VarRight) replaces a
parameter $\alpha$ having no occurrence in the left-hand side of an inequality, hence
having no upper bound, by the maximum type of its lower bound $\tau$; this computes the
maximal solution for $\alpha$, and thus also preserves the maximal solution of the
system if one exists. □

Theorem 4.1

Let E be an acyclic left-linear system. Let E' be a normal form of E. Then E is
satisfiable iff E' is in solved form, in which case $\theta_{E'}$ is a maximal
solution of E.

Proof

Consider a normal form E' for E. If E' contains a non variable pair
$\tau \leq \tau'$, as this inequality is irreducible by (Decomp), E' has no solution,
hence E is unsatisfiable by Lemma 4.2. Similarly E' has no solution if it contains an
inequality $\alpha \leq \tau$ with $\alpha \in V(\tau)$ and $\tau \neq \alpha$
(Prop. 2.5), or an inequality $\tau \leq \alpha$ with $\alpha \in V(Max(\tau))$ and
$\tau \neq \alpha$. In the other cases, by irreducibility and by acyclicity, E'
contains no inequality, hence E' contains only equalities that are in solved form,
and the substitution associated to E' is a maximal solution for E. □
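The simplification rules of this section can be sketched in a few lines of code. The block below is an illustration under stated assumptions, not the authors' implementation and not a linear-time one: the constructor order (`PARENT`, `ARITY`) is a hypothetical small forest, the mapping $\iota$ is assumed to send the arguments of the larger constructor to the first arguments of the smaller one, and the system is rescanned after each step for simplicity.

```python
# Hypothetical constructor order: int <= float <= term and list <= term.
PARENT = {"int": "float", "float": "term", "list": "term"}
ARITY = {"int": 0, "float": 0, "term": 0, "list": 1}

def is_var(t):
    return isinstance(t, str)          # variables are strings, e.g. "A"

def type_vars(t):
    if is_var(t):
        return {t}
    return set().union(*(type_vars(a) for a in t[1:]))

def subst(t, var, value):
    if is_var(t):
        return value if t == var else t
    return (t[0],) + tuple(subst(a, var, value) for a in t[1:])

def cons_leq(k1, k2):
    while k1 != k2 and k1 in PARENT:
        k1 = PARENT[k1]
    return k1 == k2

def max_type(t):
    """Max(t): the greatest type above t (variables are kept as they are)."""
    if is_var(t):
        return t
    root = t[0]
    while root in PARENT:
        root = PARENT[root]
    return (root,) + tuple(max_type(a) for a in t[1:ARITY[root] + 1])

def solve(system):
    """Return the solved form as a dict, or None if some inequality is stuck."""
    ineqs, eqs = list(system), {}

    def bind(var, value, rest):
        # Record var = value and apply the substitution [value/var] everywhere.
        for v in eqs:
            eqs[v] = subst(eqs[v], var, value)
        eqs[var] = value
        return [(subst(l, var, value), subst(r, var, value)) for l, r in rest]

    while ineqs:
        for i, (l, r) in enumerate(ineqs):
            rest = ineqs[:i] + ineqs[i + 1:]
            if is_var(l) and l == r:                           # (Triv)
                ineqs = rest
                break
            if is_var(l) and l not in type_vars(r):            # (VarLeft)
                ineqs = bind(l, r, rest)
                break
            if not is_var(l) and is_var(r):                    # (VarRight)
                upper = max_type(l)
                if all(r not in type_vars(l2) for l2, _ in rest) \
                        and r not in type_vars(upper):
                    ineqs = bind(r, upper, rest)
                    break
            if not is_var(l) and not is_var(r):                # (Decomp)
                if not cons_leq(l[0], r[0]):
                    return None
                n = ARITY[r[0]]
                ineqs = rest + list(zip(l[1:n + 1], r[1:n + 1]))
                break
        else:
            return None        # no rule applies: the normal form is not solved
    return eqs
```

With this encoding, `solve([(("list", "A"), ("list", ("int",)))])` binds `A` to `("int",)`, while `solve([("A", ("list", "A"))])` returns `None` because no rule applies to the remaining inequality.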

4.2 The general case

In absence of subtype relations between type constructors of different arities,
checking the consistency of general subtype inequalities in finite types has been
shown by Frey (Frey, 1997) to be PSPACE-complete in an arbitrary poset, with a
generalization of Fuh & Mishra's algorithm (Fuh & Mishra, 1988).

It is an open problem whether the technique used by Frey for proving consistency
in arbitrary posets can be generalized to our case with subtype relations between
type constructors of different arities.

If we assume however that the subtyping relation is a lattice, it has been shown
by Pottier (Pottier, 2000a) that the satisfiability of subtype inequalities can be
checked in cubic time in the structure of infinite regular trees, i.e. recursive types
(Amadio & Cardelli, 1993). Note that recursive types admit solutions to equations
of the form $\alpha = list(\alpha)$, namely the type list(list(...)). Below we present
Pottier's algorithm by a set of simplification rules, and show that in acyclic systems
the solving of (covariant) subtype constraints on infinite types is equivalent to the
solving on finite types.

We assume that the structure of type constructors $(\mathcal{K}, \leq)$ is a lattice
with $\bot$ and $\top$ types. We maintain our previous assumption on decreasing
arities, except on $\bot$ which is below all (n-ary) type constructors. We also assume
that if $K''/n = glb(K, K')$ then
$range(\iota_{K'',K}) \cup range(\iota_{K'',K'}) = [1,n]$, that is greatest lower
bounds do not introduce new parameters. Similarly, if $K''/n = lub(K, K')$ then
$range(\iota_{K,K''}) \cup range(\iota_{K',K''}) = [1,n]$. Note that there is no loss
of generality with this assumption as the lattice of type constructors can always be
completed by introducing glb and lub constructors with the right number of parameters.

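We consider systems of subtype inequalities between parameters or flat types,
that is of the form $\alpha \leq \beta$, $K(\alpha_1, \ldots, \alpha_n) \leq \alpha$ or
$\alpha \leq K(\alpha_1, \ldots, \alpha_n)$. Non flat types can be represented in this
form by introducing new parameters and inequalities between these parameters and the
type they represent. The simplification rules are the following:

(Trans) $E,\ \alpha \leq \beta,\ \beta \leq \gamma \longrightarrow E,\ \alpha \leq \beta,\ \beta \leq \gamma,\ \alpha \leq \gamma$
if $\alpha \leq \gamma \notin E$ and $\alpha \neq \gamma$.

(Clash) $E,\ K(\alpha_1, \ldots, \alpha_m) \leq \alpha,\ \alpha \leq K'(\alpha'_1, \ldots, \alpha'_n) \longrightarrow false$
if $K \not\leq K'$.

(Dec) $E,\ K(\alpha_1, \ldots, \alpha_m) \leq \alpha,\ \alpha \leq K'(\alpha'_1, \ldots, \alpha'_n) \longrightarrow$
$E,\ K(\alpha_1, \ldots, \alpha_m) \leq \alpha,\ \alpha \leq K'(\alpha'_1, \ldots, \alpha'_n),\ \bigwedge_{j=1}^{n} \alpha_{\iota(j)} \leq \alpha'_j$
if $K \leq K'$, $\iota = \iota_{K,K'}$ and
$\{\alpha_{\iota(j)} \leq \alpha'_j\}_{j \in [1,n]} \not\subseteq E$.

(Glb) $E,\ \alpha \leq \beta,\ \alpha \leq K(\alpha_1, \ldots, \alpha_m),\ \beta \leq K'(\alpha'_1, \ldots, \alpha'_n) \longrightarrow$
$E,\ \alpha \leq \beta,\ \alpha \leq K''(\alpha''_1, \ldots, \alpha''_l),\ \beta \leq K'(\alpha'_1, \ldots, \alpha'_n),\ \bigwedge_{j \in J} \alpha''_{\iota'(j)} \leq \alpha'_j$
if $K'' = glb(K, K')$ and $K'' \neq K$ or
$\{\alpha''_{\iota'(j)} \leq \alpha'_j\}_{j \in J} \not\subseteq E \cup \{\alpha \leq \beta\}$,
where $\iota = \iota_{K'',K}$, $\iota' = \iota_{K'',K'}$,
$J = \{j \mid \iota'(j) \notin range(\iota)\}$ and
for all $1 \leq k \leq l$, $\alpha''_k = \alpha_i$ if $\iota(i) = k$,
$\alpha''_k = \alpha'_j$ if $j \in J$ and $\iota'(j) = k$.

(Lub) $E,\ \alpha \leq \beta,\ K(\alpha_1, \ldots, \alpha_m) \leq \alpha,\ K'(\alpha'_1, \ldots, \alpha'_n) \leq \beta \longrightarrow$
$E,\ \alpha \leq \beta,\ K(\alpha_1, \ldots, \alpha_m) \leq \alpha,\ K''(\alpha''_1, \ldots, \alpha''_l) \leq \beta,\ \bigwedge_{i \in I} \alpha_{\iota(i)} \leq \alpha''_i$
if $K'' = lub(K, K')$ and $K'' \neq K'$ or
$\{\alpha_{\iota(i)} \leq \alpha''_i\}_{i \in I} \not\subseteq E \cup \{\alpha \leq \beta\}$,
where $\iota = \iota_{K,K''}$, $\iota' = \iota_{K',K''}$,
$I = \{i \in [1,m] \mid \iota(i) \notin range(\iota')\}$ and
for all $1 \leq k \leq l$, $\alpha''_k = \alpha'_j$ if $\iota'(j) = k$,
$\alpha''_k = \alpha_i$ if $i \in I$ and $\iota(i) = k$.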

Rule (Trans) computes the transitive closure of inequalities between parameters
and is mainly responsible for the cubic time complexity. Rule (Clash) checks the
consistency of the lower and upper bounds of parameters. Rule (Dec) decomposes
flat types. Rules (Glb) and (Lub) compute the greatest lower bound of upper bounds
of parameters and the least upper bound of lower bounds. We remark that if the
algorithm is applied to an initial system E containing a unique inequality of the
form $\tau \leq \alpha$ and $\alpha \leq \tau'$ for each parameter $\alpha$, the
algorithm maintains unique upper and lower bounds for each parameter. We note
$lb(\alpha)$ (resp. $ub(\alpha)$) the lower (resp. upper) bound of $\alpha$ in the
system in irreducible form.
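To make the role of the rules concrete, here is a minimal sketch restricted to nullary constructors, so that (Dec) never has anything to decompose. It is an illustration under assumptions rather than Pottier's algorithm or the Wallace API: the small lattice `UP` and all function names are hypothetical. The transitive closure plays the role of (Trans), the accumulation of bounds plays the role of (Glb) and (Lub), and the final comparison plays the role of (Clash).

```python
from itertools import product

# Hypothetical lattice of nullary constructors; UP[k] is the set of constructors >= k.
UP = {
    "bottom": {"bottom", "int", "float", "atom", "term"},
    "int": {"int", "float", "term"},
    "float": {"float", "term"},
    "atom": {"atom", "term"},
    "term": {"term"},
}

def leq(a, b):
    return b in UP[a]

def lub(a, b):
    common = UP[a] & UP[b]                                   # common upper bounds
    return next(c for c in common if common <= UP[c])

def glb(a, b):
    below = {c for c in UP if a in UP[c] and b in UP[c]}     # common lower bounds
    return next(c for c in below if all(c in UP[d] for d in below))

def solve(params, var_ineqs, lower, upper):
    """params: list of parameters; var_ineqs: pairs (a, b) meaning a <= b;
    lower/upper: dicts giving a constructor bound for some parameters.
    Returns (lb, ub), or None when a lower bound is not below an upper bound."""
    le = {(a, a) for a in params} | set(var_ineqs)
    # (Trans): Warshall-style closure, cubic in the number of parameters.
    for k, i, j in product(params, repeat=3):
        if (i, k) in le and (k, j) in le:
            le.add((i, j))
    lb = {a: "bottom" for a in params}
    ub = {a: "term" for a in params}
    # (Lub)/(Glb): for a <= b, b inherits a's lower bound, a inherits b's upper bound.
    for a, b in le:
        lb[b] = lub(lb[b], lower.get(a, "bottom"))
        ub[a] = glb(ub[a], upper.get(b, "term"))
    # (Clash): the bounds of each parameter must be consistent.
    if any(not leq(lb[a], ub[a]) for a in params):
        return None
    return lb, ub
```

For example, with `a <= b`, `int <= a` and `b <= float`, both parameters end up bounded by `int` below and `float` above; replacing the bounds by `atom <= a` and `b <= int` makes the check fail.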

Proposition 4.1
The rules terminate.

Proof

Termination with rule (Clash) is trivial. For the other rules, let us consider, as
complexity measure of the system, the couple of integers $(t, e)$ ordered by
lexicographic ordering, where $e$, the "entropy" of the system, is the number
$v^2 - n$, where $v$ is the number of parameters in the system, $n$ is the number of
inequalities between parameters, and where $t$, the "temperature" of the system, is
the sum of the height of constructors at the right of $\leq$, and of the depth of
constructors at the left of $\leq$. The height (resp. depth) of a constructor is its
distance to $\bot$ (resp. $\top$) in $(\mathcal{K}, \leq)$. We show that no rule
increases the temperature of the system, and each rule either decreases $t$ or $e$.

Rule (Trans) does not change $t$ and decreases $e$ by 1. Rule (Dec) does not change
$t$ and decreases $e$ by at least 1. Rule (Glb) either decreases $t$ if $K'' \neq K$
or decreases $e$ otherwise, and similarly for rule (Lub). Hence the algorithm
terminates. □

Theorem 4.2

(Pottier, 2000a) A system of inequalities is satisfiable over infinite regular trees
if and only if the simplification rules do not generate false, in which case the
identification of all parameters to their upper bound $ub(\alpha)$ (resp. their lower
bound $lb(\alpha)$) provides a maximum (resp. minimum) solution.

Furthermore, one can show that in our setting of acyclic systems and covariant 
constructor types, the solving of subtype constraints on infinite types is equivalent 
to the solving on finite types. 

Theorem 4.3

An acyclic system of inequalities is satisfiable over finite types if and only if the 
simplification rules do not generate false, in which case the identification of all 
parameters to their upper bound (resp. lower bound) provides a maximum (resp. 
minimum) finite solution. 

Proof 

It is sufficient to remark that the simplification rules preserve the acyclicity of the 
system, and that in an acyclic system, the identification of the parameters to their 
bounds creates finite solutions. □ 

Corollary 4.1

In a lattice structure without $\bot$, an acyclic system of inequalities is satisfiable
over finite types if and only if the simplification rules do not generate false and
$ub(\alpha) \neq \bot$ for all parameters $\alpha$.

5 Type inference

As usual with a prescriptive type system, type reconstruction algorithms can be
used to omit type declarations in programs, and still check the typability of the
program by the possibility or not to infer the omitted types
(Lakshman & Reddy, 1991). Below we describe algorithms for inferring the type of
variables and predicates, assuming type declarations for function symbols.



5.1 Type inference for variables 

Types for variables in CLP clauses and queries can be inferred by introducing 
unknowns for their type in the variable typing, and by collecting the subtype in- 
equalities along the derivation of the type judgement just like in the type checking 
algorithm. 

It is easy to check that the system of subtype inequalities thus collected is still 
acyclic, as the unknown types for CLP variables appear only in left positions. The 
system is however not left-linear if a CLP variable has more than one occurrence 
in a clause or a query. 

The second algorithm of the previous section can thus be used to infer the type 
of variables in CLP clauses and queries. 
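To see why left-linearity is lost, consider a sketch with two hypothetical predicates $q : int \to pred$ and $r : float \to pred$ (introduced here for illustration only) and the clause p(X) :- q(X), r(X). Writing $\alpha_X$ for the unknown type of X, each occurrence of X contributes one inequality:

```latex
\[
\alpha_X \leq int, \qquad \alpha_X \leq float
\]
```

The unknown $\alpha_X$ occurs twice at the left of $\leq$, so the system is not left-linear, but it remains acyclic since no unknown appears in a right-hand side.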



5.2 Type inference for predicates 

Types for predicates can be inferred as well under the assumption that predicates
are used monomorphically inside their (mutually recursive) definition
(Lakshman & Reddy, 1991). This means that inside a group of mutually recursive
clauses, each occurrence (even in the body of a clause) of a predicate defined in
these clauses must be typed with rule Head instead of rule Atom. The reason for this
restriction, similar to the one done for inferring the type of mutually recursive
functions in ML, is to avoid having to solve a semi-unification problem: i.e. given a
system of types $\tau_i$, $\tau'_i$ for $1 \leq i \leq n$, finding a substitution
$\theta$ such that for all $i$ there exists a substitution $\theta_i$ s.t.
$\tau_i\theta\theta_i = \tau'_i\theta$, that is proved undecidable in
(Kfoury et al., 1989).

Note that the SSI obtained by collecting the subtype inequalities in the derivation
of typing judgements is still acyclic, as the unknown types for predicates appear
only in the right-hand sides of the inequalities. The second algorithm of the previous
section can thus be used also to infer the type of predicates in CLP programs under
the assumption that the structure of types is a lattice without $\bot$.

One consequence of the acyclicity of the system, however, is that the maximum
type of a predicate is always $\top$. Indeed in our type system a predicate can always
be typed as maximally permissive. In the more general structure of posets with
suprema, unless the unknown types for predicates are compared with types belonging
to different $\leq$-connected components (in which case the predicate is not
typable), the substitution of an unknown type by the root of its $\leq$-connected
component is always a solution. But in all cases, this is obviously not a very
informative type to infer.

Our strategy is to infer two types for predicates: the minimum type of the predicate
and a heuristic type. The type inference algorithm proceeds as follows:

• Firstly, the minimum type of the predicate is obtained by computing the minimum
solution of the SSI associated to the typing of the complete definition
of the predicate. The minimum type of the ith argument of the predicate
is the type $\underline{\tau}_i = lb(\alpha_i)$ where $\alpha_i$ is the unknown type
associated to the ith argument of the predicate in the SSI. This minimum type is a
lower bound of all possible typings of the predicate.

• Secondly, the heuristic type is computed. This type can be parametric. It is
computed in two steps:

- First a heuristic upper type is computed for the predicate. The heuristic
upper type $\overline{\tau}_i$ of the ith argument of the predicate is obtained by
collecting the upper types $\{ub(\tau_{X_1}), \ldots, ub(\tau_{X_n})\}$ of all the
variables $\{X_1, \ldots, X_n\}$ which occur in the ith position of the predicate in
its defining clauses. Let $\tau = glb\{ub(\tau_{X_1}), \ldots, ub(\tau_{X_n})\}$ be the
greatest lower bound of the types of the variable arguments. We set

$\overline{\tau}_i = \top$ if $\tau = \top$ and $\underline{\tau}_i = \bot$,
$\overline{\tau}_i = \underline{\tau}_i$ if $\tau = \top$ and $\underline{\tau}_i \neq \bot$,
$\overline{\tau}_i = \top$ if $\bot \in \tau$,
$\overline{\tau}_i = \top$ if the identification $\overline{\tau}_i = \tau$ creates a cycle,
$\overline{\tau}_i = \tau$ otherwise.

- Then the heuristic type is computed by inferring a possibly parametric
type from the SSI associated to the heuristic upper type. The candidates
for parametric types are the parameters bounded by $\bot$ and $\top$ in the SSI
associated to the heuristic upper type. Each candidate is checked iteratively
by replacing it with a new constant and by identifying all parameters
which have the new constant in one of their bounds.

Although tedious, one can easily check that the conditions imposed in the defi- 
nition of the heuristic type create sound typings. The heuristic types thus provide 
correct type declarations for type checking the program. 

6 Implementation of the type system 

6.1 The Wallace library for solving subtype inequalities

Our current implementation uses the Wallace library by F. Pottier (Pottier, 2000b)
for solving the subtype inequalities for type inference and type checking. In both
cases, the set of type constructors $(\mathcal{K}, \leq)$ has thus to be a lattice as
described in section 4.2. Note that the type system did not require that condition:
$\leq$ could be any arity decreasing order relation on $\mathcal{K}$.

As required in the type inference algorithm, the $\top$ element is distinguished from
the type term which stands for all Prolog terms. The type $\bot$ is not considered as a
valid typing as it is an empty type.

Note that the Wallace library authorizes constrained type schemes, like for example
$+ : \forall \alpha \leq float.\ \alpha \times \alpha \to \alpha$, which expresses the
resulting type of + as a function of the type of its arguments. For the sake of
simplicity, we do not consider constrained type schemes in this paper.

6.2 The type checker 

The type checker first reads the Prolog files and deduces the files containing type
information to load. There is one file for each Prolog source file plus one file for
each module used (as in :- use_module(somemodule) in Sicstus Prolog). The system
then loads the type files and builds the structure of type constructors.

The type checker does not require the types of CLP variables in clauses
and queries to be given. Instead the type of variables is inferred as described in
section 5.1. The environment U is built with type unknowns for variables. The subtype
inequality system is collected by applying the rules of the type system and at each
step, Wallace is used to solve the type constraints.

One difficulty appears for checking the definitional genericity condition. A type
error must be raised when the definition of a predicate uses, as argument of the
head of the clause, a term whose type $\tau$ is a subtype of an instance of the
declared type $\tau'$ for this argument, and not just of a renaming. But Wallace is not
able to make the difference between being a subtype of an instance or of a renaming of
a type $\tau'$. The following consideration allows us to work around this difficulty.
If $\tau$ is a subtype of a renaming of $\tau'$, for all instances $\tau'\theta'$ of
$\tau'$ there must exist an instance $\tau\theta$ of $\tau$ such that
$\tau\theta \leq \tau'\theta'$. For checking definitional genericity, we thus
replace each parameter $\alpha$ appearing in the declared type of the head predicate by
a constructor $\kappa_\alpha$, that does not appear in the program, and such that:

• $\kappa_\alpha \leq term$, and $\kappa_\alpha \not\leq \mu$ for all constructors
$\mu$ with $\mu \neq \kappa_\alpha$, $\mu \neq term$;

• $\mu \not\leq \kappa_\alpha$ for all $\mu \neq \kappa_\alpha$.

If the rule (Atom) can be applied, using the transformed type, then the rule
(Head) can be applied as well with the original type.
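The workaround can be illustrated on ground types. The following is a minimal sketch under assumptions, not the checker's actual code: types are encoded as strings or tuples, the only base subtyping is a hypothetical int <= float, and `fresh_constant`, `freeze` and `leq` are names introduced for illustration. A fresh constant is comparable only with itself and with term, as required by the two conditions above.

```python
# Hypothetical base order on constructors: int <= float (plus everything <= term).
BASE_ORDER = {("int", "float")}

def fresh_constant(param):
    """Name of the fresh constructor standing for a type parameter."""
    return "k_" + param

def leq(t1, t2):
    """Subtyping on ground types; ("list", t) is covariant in its argument."""
    if t2 == "term":
        return True                                   # every type is below term
    if isinstance(t1, tuple) and isinstance(t2, tuple):
        return t1[0] == t2[0] and all(leq(a, b) for a, b in zip(t1[1:], t2[1:]))
    # Fresh constants k_a never occur in BASE_ORDER, so they are only
    # comparable with themselves (and with term, handled above).
    return t1 == t2 or (t1, t2) in BASE_ORDER

def freeze(declared, params):
    """Replace each type parameter of a declared type by its fresh constant."""
    if isinstance(declared, tuple):
        return (declared[0],) + tuple(freeze(a, params) for a in declared[1:])
    return fresh_constant(declared) if declared in params else declared

declared = ("list", "a")                  # p : list(a) -> pred
frozen = freeze(declared, {"a"})          # ("list", "k_a")
print(leq(("list", "int"), frozen))       # head argument of type list(int)
print(leq(frozen, frozen))                # head argument whose type is a renaming
```

With the frozen type, an argument of type list(int) is no longer accepted, whereas it would be accepted against the instance list(float) of the declared type.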

6.3 Type inference for predicates 

As described in section 5.2, two types are inferred for predicates: a minimum type
which is a lower bound of all possible typings of the predicate, and a heuristic type
which may be parametric.

If type inference is just displayed for user information, we print both types. If it
is used for typing automatically the program in a non-interactive manner, then we
choose the heuristic bound, since it is the most permissive type.

7 Experimental results 

7.1 Detection of programming errors

Here we show a small catalog of the kinds of programming errors detected by the
type checker.

7.1.1 Inversion of arguments in a predicate or a function

This error can be detected when, for example, a variable occurs in two positions
that have incompatible types.

Example 7.1 

Consider the following clause where the arguments of the length predicate have 
been reversed. 

p(L1,L2,N) :- append(L1,L2,L3), length(N,L3).

with the usual declarations:

append : $list(\alpha) \times list(\alpha) \times list(\alpha) \to pred$
length : $list(\alpha) \times int \to pred$

By the rule (Atom'), the variable L3 must be of both types $list(\alpha)$ and int. In
the type hierarchy we use, there is no type smaller than both list and int. The subtype
inequalities in the premise of rule (Atom') are thus unsatisfiable and a type error
is raised.

Note that this example motivates the discard of type $\bot$: otherwise, no error
would be detected on variables, since the empty type $\bot$ could always be inferred
for the type of any variable.

7.1.2 Misuse of a predicate or a function 

This error is detected when a term of a type $\tau$ appears as an argument of a
predicate, or of a functor that expects an argument of type $\tau'$, but
$\tau \not\leq \tau'\theta$ for any substitution $\theta$.

Example 7.2 

Consider the following clause : 

p(X,Y) :- Y is (3.5 // X).

With type declarations:

'//' : $int \times int \to int$ for integer division,
is : $float \times float \to pred$.

We try to use a float (3.5) where an int is expected. The rule (Atom') does not 
apply. 

This kind of error can be detected also inside call to foreign predicates, through 
the Prolog interface with the C programming language for example. 

Example 7.3 

Consider the declaration of a predicate p defined in C using the Sicstus - C interface:

foreign(p, p(+integer)).

Such a declaration is interpreted as a type declaration for p:

p : $int \to pred$

Then a call in a program such as:

:- p(3.14).

raises a type error since the argument is a float and the predicate expects an int.

7.1.3 Wrong predicate definition w.r.t. the declared type 

This error is detected in two ways, corresponding to the two preceding kinds of
errors. In the two following examples, the predicate p has been declared with type
$int \to pred$.

Example 7.4

Let p be defined by p([]).

Here the term [] is used as an argument of p, which requires that p accepts
arguments of type atomic_list. But $atomic\_list \not\leq int$ and the rule (Atom')
does not apply.

Example 7.5

Let p be defined by:

p(X) :- length(X, 2).

with length : $list(\alpha) \times int \to pred$

In this case, we will infer a type for X that must be smaller than $list(\alpha)$
(using rule (Atom'), because X is used by length) and smaller than int (using the rule
(Head')). As before, these types have no common subtypes, and an error is raised.

7.1.4 Violation of the definitional genericity condition 

Example 7.6
Let:

p([1]).

with p : $list(\alpha) \to pred$

The argument of p is a list, but its type is list(int), which is an instance and not
a renaming of $list(\alpha)$ (because $int \not\leq \kappa_\alpha$).

This error can also be detected when a variable is in the head of the clause:

Example 7.7
Let:

p([X]) :- X < 1.

with:

p : $list(\alpha) \to pred$

< : $float \times float \to pred$

The variable X must be of type float and $\kappa_\alpha$. The only common subtype is
$\bot$ and an error is raised.

7.2 Type checked programs

To test our system, we first tried it on 20 libraries of Sicstus Prolog, that is around
600 predicates. Then we type checked an implementation of CLP(FD) written completely
in Prolog, using a lot of meta-predicates, that contains around 170 predicates.
These tests were done using type declarations for around 100 built-in ISO
Prolog predicates and for some more built-in Sicstus predicates.

Some type errors obtained in the libraries came from the overloading of some
function symbols. For example, the function '-'/2 is used for coding pairs as well as
for coding the arithmetic operation over numbers. Another example of overloading
comes from options: it happens that some terms are common to two sets of options,
of types $\tau_1$, $\tau_2$. In this case, it is enough to create a subtype $\tau$ of
both $\tau_1$ and $\tau_2$, and tell that the common terms are of type $\tau$.

We also skip the type checking of some particular declarations, such as mode 
declarations (which are not used by our type system) : 

Example 7.8 

:- mode p(+,-,+), q(-,?).

These declarations can be typed in another type structure for mode declarations, 
but not in the same type structure as the one for predicates, since the predicate 
symbols p, q, +, - are clearly overloaded in such declarations. 

7.3 Type inference for predicates 

As said in section 6.3, we infer an interval of types for predicates. Both bounds
of the interval may offer interesting information.

Example 7.9

append([Head|Tail], List, [Head|Rest]) :-
    append(Tail, List, Rest).
append([], List, List).

Minimum type: list(bottom), list(bottom), list(bottom) -> pred
Heuristic inferred type: list(A), list(A), list(A) -> pred

Example 7.10 

sum_list([], Sum, Sum).
sum_list([Head|Tail], Sum0, Sum) :-
    Sum1 is Head+Sum0, sum_list(Tail, Sum1, Sum).

Minimum type: list(bottom), bottom, bottom -> pred
Heuristic inferred type: list(float), float, float -> pred

Sometimes, the heuristic infers a type that is too permissive. This is in particular
the case with overloaded arithmetic expressions, which are always typed as
float, not int.

Example 7.11

length([], 0).
length([_|Tail], R) :- length(Tail, L), R is L+1.

Minimum type: list(bottom), int -> pred
Heuristic inferred type: list(A), float -> pred

The heuristic may also infer a type which is too restrictive. 

Example 7.12 

is_list(X) :- var(X), !, fail.
is_list([]).
is_list([_|Tail]) :- is_list(Tail).

Minimum type: list(bottom) -> pred
Heuristic inferred type: list(A) -> pred

This is a typical example where the maximum type, here 

is_list: term -> pred 

is in fact the intended type. 

These examples should clearly justify the heuristic approach to type inference for 
predicates in a prescriptive type system. 

Finally, the interesting flatten predicate illustrates the remarkable flexibility of 
the type system. 

Example 7.13 
flatten([], []) :- !.
flatten([X|L], R) :- !, flatten(X, FX), flatten(L, FL), append(FX, FL, R).
flatten(X, R) :- R = [X].

Minimum type: list(bottom), list(bottom) -> pred
Heuristic inferred type: term, list(term) -> pred

7.4 Benchmarks

The following table sums up our evaluation results. The first column indicates
the type checked Prolog program files. The second column indicates the number
of predicates defined in each file first, and then the maximum number of atoms
by clause and by complete connected component. The third column indicates the
CPU time in seconds for type checking the program with the type declarations for
function and predicate symbols. The fourth column indicates the CPU time for inferring
the types of predicates with the type declarations for function symbols only. The
last column indicates the percentage of predicates for which the inferred type is
exactly the intended type.

The last test file is another implementation of CLP(FD) on top of Prolog which
uses a lot of metaprogramming predicates.

File          # predicates   Type checking   Type inference   % exact types
              max # atoms
arrays.pl      13   9/16          2.18 s          11.91 s         23 %
assoc.pl       31  11/24          5.29 s          40.13 s         68 %
atts.pl        14  20/119         7.43 s          77.47 s         n/a
bdb.pl        101  27/27         23.56 s          41.10 s         64 %
charsio.pl     15   7/7           1.27 s           2.21 s         33 %
clpb.pl        59  20/77         24.35 s        1827.32 s         n/a
clpq          396  39/160       355.12 s        4034.37 s         n/a
clpr          439  39/160       304.45 s        3958.41 s         n/a
fastrw.pl       4   5/7           0.44 s           0.76 s        100 %
heaps.pl       21   8/18          3.49 s          43.33 s         71 %
jasper.pl      32  11/11          7.43 s          11.97 s         84 %
lists.pl       39   6/9           2.23 s          16.17 s         97 %
ordsets.pl     35   7/18          7.43 s         199.38 s         97 %
queues.pl      12  11/18          1.37 s           4.12 s         75 %
random.pl      11  18/18          2.43 s           4.12 s         55 %
sockets.pl     24  15/27          6.79 s          15.43 s         n/a
terms.pl       13  18/27          6.96 s         308.69 s         77 %
trees.pl       13   6/15          3.07 s          12.64 s         31 %
ugraphs.pl     87  12/24         48.21 s         274.22 s         67 %
clp-fd.pl     163  20/71         24.35 s          59.65 s         n/a

Table 3. Benchmarks.

The same algorithm is used for solving the systems of subtype inequalities for 
type checking and type inference. The difference between computation times comes 
from the handling of complete connected components of definitions for type infer- 
ence, whereas for type checking, clauses are type checked one by one. In particular 
CLP(R) and CLP(Q) have very large mutually recursive clauses. 
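The gap between the two columns can be quantified directly from the table. The short sketch below copies a few rows of Table 3 (the selection of rows and the names `ROWS` and `slowdown` are choices made for this illustration) and computes the ratio of inference time to checking time.

```python
# (file, type checking seconds, type inference seconds), copied from Table 3.
ROWS = [
    ("clpb.pl", 24.35, 1827.32),
    ("terms.pl", 6.96, 308.69),
    ("ordsets.pl", 7.43, 199.38),
    ("fastrw.pl", 0.44, 0.76),
    ("jasper.pl", 7.43, 11.97),
]

def slowdown(rows):
    """Ratio of type inference time to type checking time, per file."""
    return {name: infer / check for name, check, infer in rows}

ratios = slowdown(ROWS)
for name, ratio in sorted(ratios.items(), key=lambda item: -item[1]):
    print(f"{name:12s} {ratio:6.1f}x")
```

Among these rows the ratio ranges from under 2 for jasper.pl to about 75 for clpb.pl, which is in line with the explanation in terms of connected components given above.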

In the library for arrays, the low percentage of exact matches between the inferred
type and the intended type is simply due to the typing of indices by float instead
of int. The errors in the other libraries are also due to the typing of arithmetic
expressions by float, and sometimes to the use of the equality predicate $=$ over
$\alpha \times \alpha$, which creates a typing by term for some arguments instead of a
more restrictive typing.

In the library CLP(FD), finite domain variables are typed with type int. Sim- 
ilarly in the library CLP(R), variables over the reals are typed with type float. 
One consequence is that the type checker then allows coercions from finite domain 
variables to real constraint variables. To make these coercions work in practice one 
modification in the CLP(R) library was necessary. 

8 Conclusion 

Typing constraint logic programs for checking programming errors statically, while
retaining the flexibility required for preserving all the metaprogramming facilities
of logic programming and the usual coercions of constraint programming, is the
challenge that guided the design of the type system presented in this paper. Our
experiments with the libraries of Sicstus Prolog have shown that the type system
is simple and flexible enough to accept a large variety of constraint logic programs.
The main difficulties come from conflicts of overloading for some predicates or
functions. Such ad hoc polymorphism could be resolved by considering disjunctive
formulas over types (Demoen et al., 1999). Examples have also been given to show
that the type system is useful enough for detecting programming errors such as the
inversion of arguments in a predicate, or the unintended use of a predicate.

The price to pay for this flexibility is that our type system may be regarded
as too permissive. Some intuitively ill-typed queries may not be rejected by the
type system. We have analyzed these defects in terms of the subject reduction
properties of the type system. In particular we have shown that the addition of the
typing constraints on variables to well-typed programs and queries suffices to state
subject reduction w.r.t. both CSLD resolution and substitution steps, and has the
effect of rejecting a larger set of clauses and queries by checking the satisfiability
of their constraints with the type constraints at compile-time.

The lattice assumption for the type structure, due to the implementation in Wallace
of subtype constraints, may also be regarded as too demanding in some cases.
We have already relaxed that assumption by rejecting the bottom element from
the structure of types. Nevertheless the decidability of subtype constraints under
more general assumptions is an interesting open problem. In particular, whether
the method of Frey (Frey, 1997) can be extended to cover subtype relations between
type constructors of different arities, as required in our approach, is an open
question.

Finally, it is worth noting that the results presented here are not limited to logic 
programming languages. They should be relevant to various constraint program- 
ming languages, where the main difficulty is to type check constraint variables, that 
express the communication between different constraint domains.